From Cross-Validation to SURE: Asymptotic Risk of Tuned Regularized Estimators
Adusumilli, Karun, Kasy, Maximilian, Wilson, Ashia
We derive the asymptotic risk function of regularized empirical risk minimization (ERM) estimators tuned by $n$-fold cross-validation (CV). The out-of-sample prediction loss of such estimators converges in distribution to the squared-error loss (risk function) of shrinkage estimators in the normal means model, tuned by Stein's unbiased risk estimate (SURE). This risk function provides a more fine-grained picture of predictive performance than uniform bounds on worst-case regret, which are common in learning theory: it quantifies how risk varies with the true parameter. As key intermediate steps, we show that (i) $n$-fold CV converges uniformly to SURE, and (ii) while SURE typically has multiple local minima, its global minimum is generically well separated. Well-separation ensures that uniform convergence of CV to SURE translates into convergence of the tuning parameter chosen by CV to that chosen by SURE.
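In the normal means model $y \sim N(\theta, \sigma^2 I_n)$, both a shrinkage estimator and its SURE have simple closed forms, so the tuning rule the abstract refers to can be made concrete. The sketch below is a hedged illustration, not the paper's construction: it tunes the ridge-type shrinkage estimator $\hat\theta_\lambda = y/(1+\lambda)$ by minimizing SURE over a grid; the choice of estimator, noise level, and grid are assumptions.

```python
# Hedged sketch: tuning a ridge-type shrinkage estimator by SURE in the
# normal means model y ~ N(theta, sigma^2 I_n). For theta_hat = y / (1 + lam),
# the divergence is n / (1 + lam), giving the closed-form SURE below. This
# illustrates the tuning rule, not the paper's exact construction.
import numpy as np

def sure(y, lam, sigma2):
    """Stein's unbiased risk estimate for theta_hat = y / (1 + lam)."""
    n = y.size
    residual = np.sum((y / (1 + lam) - y) ** 2)   # ||theta_hat - y||^2
    divergence = n / (1 + lam)                    # sum_i d(theta_hat_i)/d(y_i)
    return residual + 2 * sigma2 * divergence - n * sigma2

rng = np.random.default_rng(0)
theta = np.concatenate([np.full(5, 3.0), np.zeros(95)])  # sparse-ish means
y = theta + rng.standard_normal(theta.size)              # sigma^2 = 1

grid = np.logspace(-3, 2, 200)
lam_star = grid[np.argmin([sure(y, lam, 1.0) for lam in grid])]
loss = np.sum((y / (1 + lam_star) - theta) ** 2)         # oracle check
print(f"SURE-tuned lambda = {lam_star:.3f}, realized loss = {loss:.1f}")
```

With $n$-fold CV one would instead estimate the same risk curve from held-out squared errors; the paper's result is that the tuning parameter chosen this way converges to the one chosen by SURE.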
Offline Distributional RL (NeurIPS 2021 Submission)
We give a proof in Appendix A.5. As we discuss in Appendix A.6, we can use this result as follows. First, by Lemma 3.4, we have the stated bound; then, by Lemma A.1, the corresponding bound holds with probability at least $1 - \delta$. To show the claim, it suffices to show that the displayed inequality holds for all sufficiently large $k$; the claim then follows by taking the limit $k \to \infty$. We first prove a bound on the concentration of the empirical CDF around the true CDF, and then proceed by bounding the two terms in the summation.
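The CDF concentration step invoked above is presumably of Dvoretzky-Kiefer-Wolfowitz (DKW) type: $\Pr\big(\sup_x |\hat F_n(x) - F(x)| > \varepsilon\big) \le 2 e^{-2n\varepsilon^2}$. As a hedged numerical companion (an illustration under that assumption, not the paper's proof), the sketch below compares the empirical exceedance probability with the DKW bound for a standard normal.

```python
# Hedged sketch: checking a DKW-type concentration bound for the empirical
# CDF of standard-normal samples. Sample size, epsilon, and the number of
# Monte Carlo trials are illustrative assumptions.
import numpy as np
from scipy.stats import norm

def sup_cdf_gap(n, rng):
    """sup_x |F_n(x) - F(x)|, attained at the sample points since F_n is a
    step function (check the gap just before and just after each point)."""
    x = np.sort(rng.standard_normal(n))
    F = norm.cdf(x)
    upper = np.abs(np.arange(1, n + 1) / n - F)   # right limit of F_n at x_i
    lower = np.abs(np.arange(0, n) / n - F)       # left limit of F_n at x_i
    return max(upper.max(), lower.max())

rng = np.random.default_rng(1)
n, eps = 2000, 0.03
gaps = np.array([sup_cdf_gap(n, rng) for _ in range(500)])
print(f"empirical P(gap > eps) = {(gaps > eps).mean():.3f}")
print(f"DKW bound              = {2 * np.exp(-2 * n * eps**2):.3f}")
```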
Convergent Reinforcement Learning Algorithms for Stochastic Shortest Path Problem
Guin, Soumyajit, Bhatnagar, Shalabh
In this paper, we propose two algorithms for the tabular setting and one algorithm for the function approximation setting for the Stochastic Shortest Path (SSP) problem. SSP problems form an important class of problems in Reinforcement Learning (RL), since other cost criteria in RL can be reformulated as SSP problems. We show asymptotic almost-sure convergence for all our algorithms. We observe superior performance of our tabular algorithms compared to other well-known convergent RL algorithms, and reliable performance of our function approximation algorithm compared to other algorithms in that setting.
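For context on the tabular baselines, here is a hedged sketch of generic undiscounted Q-learning on a toy SSP instance: a chain with a slip probability, unit cost per step, and an absorbing goal. This is one standard convergent algorithm of the kind the comparison mentions, not the authors' proposed method; the environment, step size, and exploration rate are assumptions.

```python
# Hedged sketch: tabular Q-learning for a toy SSP instance (minimize expected
# total cost until absorption at the goal). No discounting: SSP optimizes
# total cost. Environment and hyperparameters are illustrative assumptions.
import numpy as np

N, GOAL = 10, 10                     # chain of states 0..N, goal at N
rng = np.random.default_rng(2)
Q = np.zeros((N + 1, 2))             # actions: 0 = left, 1 = right

def step(s, a):
    """Move left/right with a small slip; unit cost per step, goal is free."""
    if s == GOAL:
        return s, 0.0
    move = 1 if a == 1 else -1
    if rng.random() < 0.1:           # 10% slip probability (assumption)
        move = -move
    return int(np.clip(s + move, 0, N)), 1.0

for episode in range(2000):
    s = 0
    while s != GOAL:
        a = rng.integers(2) if rng.random() < 0.1 else int(np.argmin(Q[s]))
        s2, cost = step(s, a)
        target = cost + Q[s2].min()  # undiscounted cost-to-go target
        Q[s, a] += 0.1 * (target - Q[s, a])
        s = s2

print("cost-to-go from state 0:", Q[0].min())  # roughly 10 / (1 - 2 * 0.1)
```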
A Proof of Theorem 3.1
In this section, we prove Theorem 3.1, which says that it suffices to work with the augmented state space. First, we have the following lemma. Lemma A.2 gives the required bound. Now we prove Theorem 3.1: by Lemma A.2, the stated relation holds, and Theorem 3.1 follows straightforwardly from this result. Consider the same setup as in Lemma B.1; by Lemma B.1, the corresponding bound holds.
Supplementary Materials Roadmap
In this supplementary material, we provide "full versions" of Sections 2-4 from the main submission. Fact 2.5 (uniform bound on entries of a Gaussian vector): for $g \sim N(0, I_d)$, $g/\|g\|$ is identical in distribution to $v$, a uniformly random unit vector. We can expand the expectation and apply Fact B.2. We will also need a stability result for affine linear thresholds; putting all of these ingredients together, we can complete the proof of the main Lemma B.1 by applying Lemma B.6 to the projection of $f$ to the relevant two-dimensional subspace. This notion is motivated by Lemma 4.4 in Section C.1, where we study the critical points. We first collect some elementary consequences of closeness. In the rest of the paper we take the closeness parameter to be small, so Lemma 3.3 will always apply.
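Fact 2.5 is easy to probe numerically under its standard reading (that $v$ is a uniformly random unit vector): normalizing a standard Gaussian yields such a $v$, whose largest coordinate concentrates at scale $\sqrt{\log d / d}$. The dimension, trial count, and constant below are illustrative assumptions.

```python
# Hedged sketch of Fact 2.5: for g ~ N(0, I_d), g / ||g|| is a uniform point
# v on the unit sphere, and max_i |v_i| is O(sqrt(log d / d)) with high
# probability. The comparison constant is illustrative, not the paper's.
import numpy as np

d, trials = 1000, 200
rng = np.random.default_rng(3)

g = rng.standard_normal((trials, d))
v = g / np.linalg.norm(g, axis=1, keepdims=True)   # uniform on the sphere

max_entry = np.abs(v).max(axis=1)
print(f"mean max |v_i|    = {max_entry.mean():.4f}")
print(f"sqrt(2 log d / d) = {np.sqrt(2 * np.log(d) / d):.4f}")
```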
Supplementary Material (Appendix): When Are Solutions Connected in Deep Networks?
Hence, (15) holds and the desired claim follows. Thus, by using assumption (A1) again, we can apply Corollary A.1 of [...]: the second condition follows from assumption (A1), so the application of Corollary A.1 is justified, and the desired claim follows from Theorem 4.1. After a w.l.o.g. normalization, this shows that the set of features formed by these neurons is linearly separable.
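The final claim asserts linear separability of a feature set. A generic way to certify this for a finite sample, sketched below on synthetic placeholders (not the network features from the proof), is to fit a near-hard-margin linear classifier and check for zero training error.

```python
# Hedged sketch: certifying linear separability of a finite feature set by
# fitting a (near) hard-margin linear classifier. The feature matrix and
# labels are synthetic placeholders, built to be separable by construction.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(4)
features = rng.standard_normal((60, 5))
labels = (features @ np.array([1.0, -2.0, 0.5, 0.0, 1.0]) > 0).astype(int)

clf = LinearSVC(C=1e6, max_iter=100_000).fit(features, labels)  # ~hard margin
separable = clf.score(features, labels) == 1.0                  # zero error?
print("linearly separable:", separable)
```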